High Performance LDA through Collective Model Communication Optimization
Similar resources
High Performance LDA through Collective Model Communication Optimization
LDA is a widely used machine learning technique for big data analysis. The application includes an inference algorithm that iteratively updates a model until it converges. A major challenge in parallelization is scaling, because the model is huge and parallel workers need to communicate it continually. We identify three important features of the model in para...
Parallel LDA Through Synchronized Communication Optimizations
Sophisticated big data machine learning applications are difficult to parallelize because they must not only process a big training dataset but also synchronize big model data across iterations. In parallel LDA, comparing synchronized and asynchronous communication methods under data parallelism and model parallelism, we note that the power-law distribution of word counts in LDA training...
Clustering Social Images with MapReduce and High Performance Collective Communication
Social image clustering is a data-intensive application that poses novel challenges to high performance computing. Already this field has reached 10-100 million images represented as points in a high dimensional (up to 2048) vector space that are to be divided into up to 1-10 million clusters. In recent years MapReduce has become popular in processing big data problems due to its attractive ...
Optimization of Collective Communication Operations in MPICH
We describe our work on improving the performance of collective communication operations in MPICH for clusters connected by switched networks. For each collective operation, we use multiple algorithms depending on the message size, with the goal of minimizing latency for short messages and minimizing bandwidth use for long messages. Although we have implemented new algorithms for all MPI (Messa...
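The trade-off in this abstract (low latency for short messages, low bandwidth use for long ones) can be illustrated with a rough sketch. It assumes the common latency-bandwidth cost model, in which sending n bytes costs alpha + n * beta. The abstract does not state this model. The function names, the default values of `alpha` and `beta`, and the choice of broadcast as the example operation are all hypothetical, and this is not MPICH's actual selection logic.

```python
import math


def bcast_cost_binomial(n_bytes, p, alpha, beta):
    """Binomial-tree broadcast: ceil(log2 p) steps, each sending the full message."""
    return math.ceil(math.log2(p)) * (alpha + n_bytes * beta)


def bcast_cost_scatter_allgather(n_bytes, p, alpha, beta):
    """Scatter plus ring allgather: more steps, but each moves about n/p bytes."""
    steps = math.ceil(math.log2(p)) + (p - 1)
    return steps * alpha + 2.0 * (p - 1) / p * n_bytes * beta


def choose_bcast(n_bytes, p, alpha=1e-5, beta=1e-9):
    """Return the name of the cheaper algorithm under the assumed cost model.

    alpha: per-message latency in seconds (hypothetical default).
    beta: per-byte transfer time in seconds (hypothetical default).
    """
    if p < 1:
        raise ValueError("p must be at least 1")
    tree = bcast_cost_binomial(n_bytes, p, alpha, beta)
    split = bcast_cost_scatter_allgather(n_bytes, p, alpha, beta)
    return "binomial_tree" if tree <= split else "scatter_allgather"


if __name__ == "__main__":
    for size in (64, 10_000_000):
        print(size, choose_bcast(size, p=16))
```

Under these assumed parameters the latency term dominates for a 64-byte message, so the tree with fewer steps is cheaper, while for a 10 MB message the per-byte term dominates and splitting the message is cheaper.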
Performance Analysis of DECK Collective Communication Service
Collective communication is very useful for parallel applications, especially those in which matrix and vector data structures need to be manipulated by a group of processes. This paper presents a performance analysis of collective communication primitives designed for the DECK parallel programming environment, with the aid of different numerical methods used to solve hydrodynamics and mass tra...
Journal
Journal title: Procedia Computer Science
Year: 2016
ISSN: 1877-0509
DOI: 10.1016/j.procs.2016.05.300